DETERMINING THE FUNCTIONS AND INTERACTIONS OF 
PROTEINS BY COMPARATIVE ANALYSIS 

Related Applications 

The present application is a continuation-in-part application ("CIP") of Patent 
Cooperation Treaty (PCT) International Application Serial No. PCT/US00/02246, filed in the 
U.S. receiving office on January 28, 2000, and this application claims the benefit of priority 
under 35 U.S.C. § 119(e) of U.S. Provisional Application Nos. 60/165,124 and 60/165,086, 
both filed November 12, 1999, and U.S. Provisional Application No. 60/179,531, filed February 
1, 2000. International Application Serial No. PCT/US00/02246 claims the benefit of priority 
under 35 U.S.C. § 119(e) of U.S. Provisional Application Serial No. 60/117,844, filed January 
29, 1999, U.S. Provisional Application Serial No. 60/118,206, filed February 1, 1999, U.S. 
Provisional Application Serial No. 60/126,593, filed March 26, 1999, U.S. Provisional 
Application Serial No. 60/134,093, filed May 14, 1999, and U.S. Provisional Application 
Serial No. 60/134,092, filed May 14, 1999. Each of the aforementioned applications is 
explicitly incorporated herein by reference in its entirety and for all purposes. 

TECHNICAL FIELD 

This invention generally relates to genetics and microbiology. The invention 
provides novel methods to identify the function of and relationships between nucleic acid and 
protein sequences. The method is particularly useful for identifying genes and 
polypeptides having potential therapeutic relevance in organisms, e.g., microorganisms, such 
as Mycobacterium tuberculosis. The invention also provides Mycobacterium tuberculosis 
genes and polypeptides found by these methods. These genes and polypeptides are useful as 
potential drug targets. 

BACKGROUND 

The determination of the functions of and relationships between nucleic acid 
and protein sequences has traditionally relied on either the study of homology and sequence 
identity with genes and proteins of known function or, in the absence of informative 
homology, laborious experimental work. The availability of many complete genome 
sequences has made it possible to develop new strategies for computational determination of 
protein functions. Several methods have been developed which can predict the general 



